Sentiment detection in micro-blogs using unsupervised chunk extraction

Authors

  • Pierre Magistry
  • Shu-Kai Hsieh
  • Yu-Yun Chang
Abstract

*Correspondence: [email protected] Graduate Institute of Linguistics, National Taiwan University, Taipei City, Taiwan

In this paper, we present a system for sentiment detection in Chinese micro-blog data. Somewhat surprisingly, the system benefits from the lack of word boundaries in the Chinese writing system, which lets it shift the focus directly to larger and more relevant chunks. We use an unsupervised Chinese word segmentation system and a binomial test to extract specific and endogenous lexical chunks from the training corpus. We combine these chunks with other external resources to train a maximum entropy model for document classification. With this method, we obtained an averaged F1 score of 87.2, which outperforms the state-of-the-art approach on the data released for the second SocialNLP shared task.
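The abstract does not spell out how the binomial test is applied, so the following is only a minimal sketch under stated assumptions: each candidate chunk is assumed to have a total count in the corpus and a count within one sentiment class, and the null hypothesis is assumed to be that the chunk occurs in that class at the class's overall share of the corpus. The function names, the threshold `alpha`, and the toy counts are all hypothetical and are not taken from the paper.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p) (one-sided binomial test)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def specific_chunks(counts_in_class, counts_total, class_share, alpha=0.01):
    """Keep chunks that occur in the class significantly more often than chance.

    counts_in_class: chunk -> occurrences inside the target class (assumed input)
    counts_total:    chunk -> occurrences in the whole training corpus
    class_share:     fraction of the corpus belonging to the target class
    alpha:           significance threshold (hypothetical default)
    """
    selected = {}
    for chunk, n in counts_total.items():
        k = counts_in_class.get(chunk, 0)
        p_value = binomial_tail(k, n, class_share)
        if p_value < alpha:
            selected[chunk] = p_value
    return selected


# Toy counts, invented for illustration only.
counts_total = {"好開心": 20, "今天": 100}
counts_in_class = {"好開心": 19, "今天": 52}
lexicon = specific_chunks(counts_in_class, counts_total, class_share=0.5)
print(sorted(lexicon))
```

With these toy counts, a chunk that appears almost only in one class passes the test, while a chunk spread evenly across classes does not. Chunks selected this way could then serve as features for a classifier such as the maximum entropy model the abstract mentions.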

Similar resources

WeFeelFine as Resource for Unsupervised Polarity Classification

This paper shows the results obtained by an unsupervised method for sentiment polarity detection on micro-blogs. The method needs no training; instead, it is self-constructed from millions of publications on the web. The results show the effectiveness of the proposal, opening a new way of approaching sentiment analysis in micro-blogs.

Impact of Feature Selection on Micro-Text Classification

Social media datasets – especially Twitter tweets – are popular in the field of text classification. Tweets are a valuable source of microtext (sometimes referred to as “micro-blogs”), and have been studied in domains such as sentiment analysis, recommendation systems, spam detection, clustering, among others [6]. Tweets often include keywords referred to as “Hashtags” that can be used as labels fo...

Micro-blogging Sentiment Analysis Using Bayesian Classification Methods

In this project I address the problem of accurately classifying the sentiment in posts from micro-blogs such as Twitter. As Twitter gains popularity, it becomes more useful to analyze trends and sentiment of its users towards various topics. Determining the general attitude of users towards a product or service, for example, can help a business measure overall consumer attitudes and customer sa...

An Investigation of Recursive Auto-associative Memory in Sentiment Detection

The rise of blogs, forums, social networks and review websites in recent years has provided very accessible and convenient platforms for people to express thoughts, views or attitudes about topics of interest. In order to collect and analyse opinionated content on the Internet, various sentiment detection techniques have been developed based on an integration of part-of-speech tagging, negation...

Feature Extraction from Micro-blogs for Comparison of Products and Services

Social networks are a popular place for people to express their opinions about products and services. One question is whether two similar products (e.g., two different brands of mobile phones) can be made comparable to each other. In this paper, we present our system, OpinionAnalyzer, a novel social network analyser designed to collect opinions from Twitter micro-blogs about two...

Publication date: 2016